Skip to content

Add --deps-only flag to separate dependency fetching from source builds#1336

Open
sbryngelson wants to merge 12 commits intoMFlowCode:masterfrom
sbryngelson:nodeps
Open

Add --deps-only flag to separate dependency fetching from source builds#1336
sbryngelson wants to merge 12 commits intoMFlowCode:masterfrom
sbryngelson:nodeps

Conversation

@sbryngelson
Copy link
Copy Markdown
Member

@sbryngelson sbryngelson commented Mar 25, 2026

Summary

  • Adds --deps-only CLI flag to ./mfc.sh build that only fetches and builds dependencies (FFTW, HDF5, etc.) without building MFC source targets
  • Already-configured dependencies are skipped during regular builds, preventing network access in the source build step
  • Frontier and Frontier AMD CI now fetch deps on login nodes (internet access) and build source on compute nodes (no network needed)

Test plan

  • Verify ./mfc.sh build --deps-only -j 8 builds only dependency targets
  • Verify regular ./mfc.sh build -j 8 skips already-configured deps
  • Verify Frontier CI: deps fetched on login node, source built + tested on compute node
  • Verify Frontier AMD CI: same pattern via symlinked build.sh
  • Verify Phoenix CI is unaffected (no login-node build step)
  • Verify bench.yml Frontier/Frontier AMD jobs work with updated build_script args

This allows CI to fetch and build dependencies (FFTW, HDF5, etc.) on
login nodes with internet access, then build MFC source code on compute
nodes that may have no network connectivity.

Key changes:
- New --deps-only CLI flag for ./mfc.sh build
- Already-configured dependencies are skipped entirely during regular
  builds, guaranteeing no network access in the source build step
- Frontier and Frontier AMD now follow the pattern: deps on login node,
  source build + test on compute node
@sbryngelson sbryngelson marked this pull request as ready for review March 25, 2026 14:55
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 25, 2026

📝 Walkthrough

Walkthrough

This pull request refactors the CI/CD build workflows to separate dependency fetching from source building. Changes include removing the bench argument from Frontier build commands in the benchmarking workflow, updating workflow step labels from "Setup & Build" to "Fetch Dependencies" and "Test" to "Build & Test", removing conditional build-directory checks to enable unconditional builds, simplifying the Frontier build script to only fetch dependencies without the run_bench parameter, and adding a new --deps-only flag to the Python build CLI command that resolves and builds only dependencies without building MFC targets. The toolchain's build module now includes early-exit logic for already-configured dependencies and a dedicated control flow for the deps_only mode.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The PR description covers the main changes, motivation, and test plan, but is missing key required template sections like Type of change and Checklist items. Complete the required template sections: specify Type of change (refactor, new feature, etc.), check the Checklist boxes for tests/documentation updates, and provide explicit testing details for unchecked items.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding a --deps-only flag to separate dependency fetching from source builds, which is the core feature across all modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 47.42%. Comparing base (6120810) to head (7ab2744).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1336      +/-   ##
==========================================
+ Coverage   45.01%   47.42%   +2.41%     
==========================================
  Files          70       70              
  Lines       20562    19202    -1360     
  Branches     1962     1635     -327     
==========================================
- Hits         9255     9106     -149     
+ Misses      10179     9229     -950     
+ Partials     1128      867     -261     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

When __build_target recursively called build() to resolve sub-deps
(e.g. SILO depends on HDF5), the --deps-only guard intercepted
the recursive call and only built deps-of-deps (none for HDF5),
never building HDF5 itself. This caused SILO's configure to fail
with 'HDF5 was not found'.

Fix: only enter the --deps-only path on the top-level call
(history is empty), letting recursive calls proceed normally.
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 25, 2026

Claude Code Review

Incremental review from: e836a1b
Head SHA: 7ab2744

New findings since last Claude review:

  • Frontier build parallelism regression (build.sh:43): The original test.sh used -j 1 for Frontier to work around a CCE 19.0.0 IPA SIGSEGV (comment: "Frontier Cray: -j 1 to work around CCE 19.0.0 IPA SIGSEGV"). The new build.sh unconditionally passes -j 8 for all clusters. If the CCE compiler bug is still present, Frontier builds will crash during parallel link/IPA, causing flaky CI.

  • --debug flag silently dropped for Frontier (test.sh:63): The old final test invocation included $debug_opts, which was --debug for Frontier ("for backtrace on build/runtime errors"). The new command drops it entirely. If intentional, a comment explaining the decision is absent; if accidental, Frontier runtime failures will lose backtraces.

sbryngelson and others added 10 commits March 25, 2026 15:10
Frontier/Frontier AMD now follow the same deps-on-login, source-on-compute
pattern for case optimization tests. Previously, prebuild-case-optimization.sh
built deps + source on the login node. Now:
- Login node: build.sh fetches deps via --deps-only
- Compute node: run_case_optimization.sh builds case-optimized binaries
  then runs them

Phoenix is unchanged (prebuild + run both in SLURM).
…se-opt builds

Remove -h ipa0, --debug, and -j 1 workarounds for Frontier Cray builds.
These did not address the root cause: corrupted intermediate files from
prior failed builds persisting on self-hosted CI runners and poisoning
subsequent compilations. Instead, clean stale hash-named MFC target
staging/install dirs before case-opt builds while preserving dependency
dirs (hipfort, fftw, etc.) that cannot be re-fetched without internet.
…d files

Add clean: false to benchmark checkout steps, matching the test workflow.
This prevents actions/checkout from failing on NFS silly-rename files
(.nfs*) left by still-running SLURM processes, and preserves .slurm_job_id
files so submit-slurm-job.sh can detect and cancel stale jobs on retry.
Split the combined Build & Test SLURM job into two sequential jobs so
each gets its own full time allocation. This fixes timeouts on Frontier
where build time (~20 min) was eating into the 1:59:00 test budget.

Also removes Frontier -j 1/--debug workarounds from the build step.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant